
89 research outputs found

Noise Robust Blind System Identification Algorithms Based On A Rayleigh Quotient Cost Function

Author: Brookes D; Doclo S; Hu M; Naylor P; Sharma D
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/05/2015

Spiral - Imperial College Digital Repository

RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field

Author: Doclo S.; Gößling N.
Publication date: 12/07/2018

Besides suppressing all undesired sound sources, an important objective of a binaural noise reduction algorithm for hearing devices is the preservation of the binaural cues, aiming at preserving the spatial perception of the acoustic scene. A well-known binaural noise reduction algorithm is the binaural minimum variance distortionless response beamformer, which can be steered using the relative transfer function (RTF) vector of the desired source, relating the acoustic transfer functions between the desired source and all microphones to a reference microphone. In this paper, we propose a computationally efficient method to estimate the RTF vector in a diffuse noise field, requiring an additional microphone that is spatially separated from the head-mounted microphones. Assuming that the spatial coherence between the noise components in the head-mounted microphone signals and the additional microphone signal is zero, we show that an unbiased estimate of the RTF vector can be obtained. Based on real-world recordings, experimental results for several reverberation times show that the proposed RTF estimator outperforms the widely used RTF estimator based on covariance whitening and a simple biased RTF estimator in terms of noise reduction and binaural cue preservation performance. Comment: Accepted at ITG Conference on Speech Communication 201

arXiv.org e-Print Archive

Instrumental and perceptual evaluation of dereverberation techniques based on robust acoustic multichannel equalization

Author: Cauchi B.; Doclo S.; Goetze S.; Kodrasi I.
Publication venue: 'Audio Engineering Society'
Publication date: 01/01/2017

Speech signals recorded in an enclosed space by microphones at a distance from the speaker are often corrupted by reverberation, which arises from the superposition of many delayed and attenuated copies of the source signal. Because reverberation degrades the signal, removing reverberation would enhance quality. Dereverberation techniques based on acoustic multichannel equalization are known to be sensitive to room impulse response perturbations. In order to increase robustness, several methods have been proposed, for example using a shorter reshaping filter length, incorporating regularization, or applying a sparsity-promoting penalty function. This paper focuses on evaluating the performance of these methods for single-source multi-microphone scenarios, using instrumental performance measures as well as subjective listening tests. By analyzing the correlation between the instrumental and the perceptual results, it is shown that signal-based performance measures are more advantageous than channel-based performance measures to evaluate the perceptual speech quality of signals that were dereverberated by equalization techniques. Furthermore, this analysis also demonstrates the need to develop more reliable instrumental performance measures.

Crossref

Fraunhofer-ePrints

White Rose Research Online

Square root-based multi-source early PSD estimation and recursive RETF update in reverberant environments by means of the orthogonal Procrustes problem

Author: Dietzen T.; Doclo S.; Moonen M.; van Waterschoot T.
Publication date: 18/06/2019

Multi-channel short-time Fourier transform (STFT) domain-based processing of reverberant microphone signals commonly relies on power-spectral-density (PSD) estimates of early source images, where early refers to reflections contained within the same STFT frame. State-of-the-art approaches to multi-source early PSD estimation, given an estimate of the associated relative early transfer functions (RETFs), conventionally minimize the approximation error defined with respect to the early correlation matrix, requiring non-negative inequality constraints on the PSDs. Instead, we here propose to factorize the early correlation matrix and minimize the approximation error defined with respect to the early-correlation-matrix square root. The proposed minimization problem -- constituting a generalization of the so-called orthogonal Procrustes problem -- seeks a unitary matrix and the square roots of the early PSDs up to an arbitrary complex argument, making non-negative inequality constraints redundant. A solution is obtained iteratively, requiring one singular value decomposition (SVD) per iteration. The estimated unitary matrix and early PSD square roots further allow the RETF estimate to be updated recursively, which is not inherently possible in the conventional approach. An estimate of the said early-correlation-matrix square root itself is obtained by means of the generalized eigenvalue decomposition (GEVD), where we further propose to restore non-stationarities by desmoothing the generalized eigenvalues in order to compensate for inevitable recursive averaging. Simulation results indicate fast convergence of the proposed multi-source early PSD estimation approach in only one iteration if initialized appropriately, and better performance as compared to the conventional approach.

arXiv.org e-Print Archive

Measuring, modelling and predicting perceived reverberation

Author: Cauchi B.; Doclo S.; Goetze S.; Javed H.A.; Naylor P.A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

This paper investigates the relationship between the perceived level of reverberation and parameters measured from the room impulse response (RIR), as well as the design of an instrumental measure that predicts this perceived level. We first present the results of an experimental listening test conducted to assess the level of perceived reverberation in speech captured by a single microphone, before analysing the gathered data to assess the influence of parameters such as the reverberation time (T60) or the direct-to-reverberant ratio (DRR). Secondly, we use the results of this analysis to improve the signal-based reverberation decay tail (RDT) measure, previously proposed by the authors to predict the perceived level of reverberation. The accuracy of the proposed measure is evaluated in terms of correlation with the subjective scores and compared to the performance of predictors using parameters extracted from the RIR. Results show that the proposed modifications to the RDT do improve its accuracy. Though still slightly outperformed by measures based on parameters of the RIR, we believe the proposed measure to be useful in scenarios in which the RIR or its parameters are unknown.

Crossref

Fraunhofer-ePrints

White Rose Research Online

Low-bandwidth binaural beamforming

Author: Doclo; Hamacher; S. Srinivasan
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2008

Crossref

Non-intrusive speech quality prediction using modulation energies and LSTM-network

Author: Cauchi B.; Doclo S.; Falk T.H.; Goetze S.; Santos J.F.; Siedenburg K.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2019

Many signal processing algorithms have been proposed to improve the quality of speech recorded in the presence of noise and reverberation. Perceptual measures, i.e., listening tests, are usually considered the most reliable way to evaluate the quality of speech processed by such algorithms but are costly and time-consuming. Consequently, speech enhancement algorithms are often evaluated using signal-based measures, which can be either intrusive or non-intrusive. As the computation of intrusive measures requires a reference signal, only non-intrusive measures can be used in applications for which the clean speech signal is not available. However, many existing non-intrusive measures correlate poorly with the perceived speech quality, particularly when applied over a wide range of algorithms or acoustic conditions. In this paper, we propose a novel non-intrusive measure of the quality of processed speech that combines modulation energy features and a recurrent neural network using long short-term memory cells. We collected a dataset of perceptually evaluated signals representing several acoustic conditions and algorithms and used this dataset to train and evaluate the proposed measure. Results show that the proposed measure yields higher correlation with perceptual speech quality than benchmark intrusive and non-intrusive measures when considering various categories of algorithms. Although the proposed measure is sensitive to mismatch between training and testing, results show that it is a useful approach to evaluate specific algorithms over a wide range of acoustic conditions and may, thus, become particularly useful for real-time selection of speech enhancement algorithm settings.

White Rose Research Online

Optimal Binaural LCMV Beamforming in Complex Acoustic Scenarios: Theoretical and Practical Insights

Author: Doclo S.; Gößling N.; Marquardt D.; Merks I.; Zhang T.
Publication date: 12/07/2018

Binaural beamforming algorithms for head-mounted assistive listening devices are crucial to improve speech quality and speech intelligibility in noisy environments, while maintaining the spatial impression of the acoustic scene. While the well-known BMVDR beamformer is able to preserve the binaural cues of one desired source, the BLCMV beamformer uses additional constraints to also preserve the binaural cues of interfering sources. In this paper, we provide theoretical and practical insights on how to optimally set the interference scaling parameters in the BLCMV beamformer for an arbitrary number of interfering sources. In addition, since in practice only a limited temporal observation interval is available to estimate all required beamformer quantities, we provide an experimental evaluation in a complex acoustic scenario using measured impulse responses from hearing aids in a cafeteria for different observation intervals. The results show that even rather short observation intervals are sufficient to achieve a decent noise reduction performance and that a proposed threshold on the optimal interference scaling parameters leads to smaller binaural cue errors in practice. Comment: To appear in Proc. IWAENC 201

arXiv.org e-Print Archive

Phase reference for the generalized multichannel Wiener filter

Author: C Knapp; EAP Benesty; EAP Habets; I Cohen; I Kodrasi; J Chen; J Freudenberger; J Schmalenstroeer; JB Allen; L Wang; M Schwab; MR Schroeder; MS Brandstein; PA Naylor; R Stewart; S Doclo; S Gannot; S Markovich-Golan; S Miyabe; TC Lawin-Ore; TG Dvorkind; TG Manickam
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Stereophonic hands-free communication system based on microphone array fixed beamforming: real-time implementation and evaluation

Author: A Gilloire; C Wun; E Ferrara; F Bettarelli; Francesco Piazza; H Buchner; H Chen; HS Malvar; J Benesty; J Herre; JA Swets; JJ Shynk; L Gabrielli; L Romoli; Laura Romoli; M Ali; M Brandstein; M Kallinger; M Pirro; MA Iqbal; Matteo Pirro; MMS Doclo; N Tashev; P Oak; S Doclo; S Haykin; SL Gay; Stefano Squartini; T Fawcett; W Chen; W Herbordt; W Hoeg; W Kellerman
Publication venue: 'Springer Science and Business Media LLC'

Crossref